INTRODUCTION


Normal and malignant human mammary epithelial cells are able to synthesize and respond to a
variety of locally acting growth factors through specific receptors. Among these are the type 1 family
of growth factor receptors, which consists of the epidermal growth factor receptor (EGF-R), ErbB2/Neu,
ErbB3, and ErbB4 (1-6). They are required for normal mammary development and lactation and are
aberrantly expressed in approximately 40% of breast carcinomas. Indeed, in human breast cancer the
prognosis of a patient is inversely correlated with overexpression and/or amplification of this receptor
family. The physiological regulatory ligand for ErbB2/Neu has been shown to be heregulin (7-9). Interaction
of heregulin with ErbB3 induces heterodimerization between ErbB2 and ErbB3, which results in the
transphosphorylation and activation of the ErbB2 receptor. Phosphorylation of this receptor initiates signaling
cascades, which in turn can affect cell function, growth and division.

One way to regulate protein expression in the cell is to control mRNA concentration.
Cytoplasmic mRNA levels represent a balance between transcription, splicing and nuclear export on the one
hand and mRNA degradation on the other. The balance between synthesis and degradation represents a major
control point in gene expression. Recent studies have shown that post-transcriptional mechanisms regulate
protein expression in certain cell lines. For example, down-regulation of the proto-oncogene c-myc mRNA
during differentiation of C2C12 myoblasts to myotubes is mediated by a cytoplasmic mRNA turnover event
rather than a nuclear processing event (10). More importantly, Balmer et al. (11) have shown that the
EGF-induced up-regulation of EGF-R mRNA in two human breast cancer cell lines that over-express EGF-R
(MDA-MB-468 and BT-20) is accompanied by stabilization (>2-fold) of EGF-R mRNA. They showed that
the EGF-R mRNA contains a novel complex AU-rich 260-nt cis-acting destabilizing element in the 3' UTR
that is bound by specific and EGF-regulated trans-acting factors.

Wilson et al. (12-13) have identified a novel nuclear target for heregulin signaling which responds to
growth factor treatment of cells with an increased ability to be labeled with GTP. They identified this target
as the 20-kDa subunit of the nuclear cap binding complex (CBC) and demonstrated that the CBC is stimulated
to bind to capped RNAs in response to heregulin. Based on these observations, Wilson et al. suggested that
heregulin could affect cell growth by modulating gene expression at the level of RNA processing via
the CBC. They further suggested that, in a situation where the heregulin signal is constitutive, the active CBC
could affect gene expression by amplifying the rate of RNA processing, and thus contribute to unregulated
cell growth and division.

The CBC is a stable heterodimer of an 18-kDa subunit and a 90-kDa subunit
(CBP20 and CBP80, respectively). Biochemical and genetic experiments have shown important roles for CBC
in mRNA physiology, including splicing (14-17), nuclear export (18-19), 3' end processing (20), translation
initiation (21), nonsense-mediated decay (NMD) (22) and mRNA degradation (23). Because of its active role in
mRNA stability, the CBC could play a role in protein expression. Regulation of CBC by growth factor
signaling may in turn represent a mechanism to change cytosolic or nuclear protein levels.

We have solved the structure of the CBC at 2.2 Å resolution by combining molecular replacement (using a
partially proteolyzed CBC as the model) with phases from a Kr MAD dataset. Mazza et al. (24) had previously
published the structure of CBP80 and a proteolyzed fragment of CBP20 (residues 38-116). Because of the
proteolysis, their structure could not answer several important questions, such as how the cap binds to CBP20
and how CBP80 increases the affinity of CBP20 for the cap structure. The atomic structure reported here comprises
CBP80, 95% of CBP20 (residues 7-153) and the cap structure analog m7GpppG.
BODY


1.  Experimental  Procedures 

Purification and crystallization protocols for CBC were outlined in the previous report.

Solving the phase problem

Table 1 summarizes crystallographic and phasing data.

Table 1. Crystallographic and phasing data

Molecular Replacement
  Model                                                 CBP80 from Mazza et al. (24) and a fragment of CBP20
  Correlation coefficient                               24.7
  R-factor (%)                                          42

Refinement
  Resolution (Å)                                        50-2.2
  No. of reflections (test set)                         51677 (2571)
  Rcryst/Rfree (%)                                      21.7/25.7
  Number of atoms                                       7240
  Water molecules                                       391
  Average B-factor (Å²)                                 40.2
  R.m.s. bonds (Å)                                      0.007
  R.m.s. angles (°)                                     1.6
  Percent of residues in most favored regions           90.0%
  Percent of residues in additionally allowed regions   9.4%


2.  Results 

The  CBC 

Figure 1 illustrates the triple complex formed by CBP20, CBP80 and the cap analog m7GpppG. The
atomic structure of CBP80 presented here is identical to the structure solved by Mazza et al. (24). CBP80 is
all α-helical and consists of three domains connected by two long linkers. Domains 1 (D1) and 3 (D3) are
packed against domain 2 (D2), which constitutes the core of the protein. The N-terminal domain (D1) is
structurally similar to the middle domain of eIF4G, which plays a regulatory role in RNA translation and
protein synthesis. Interestingly, linker 1, connecting D1 and D2, contains a surface-exposed proline-rich
sequence 268-PPFTPPPH-277 (P, conserved residue). Further biochemical experiments are needed to test
possible interactions of this proline-rich region with SH3 domains. The detailed structure of CBP80 was
previously reported by Mazza et al. (24) and will not be the subject of this report.

Overall  Fold  of  CBP20 

The overall fold of CBP20 (Figure 2) is that of a classical ribonucleotide binding domain (RNP) and
consists of a four-stranded anti-parallel β-sheet packed against two α-helices. β3 (residues 81-88) and β1
(residues 41-46) form the RNP1 and RNP2 motifs, respectively. Other important regions include the N-terminal
region that interacts with CBP80, a loop (loop 3) between β2 and β3, and a C-terminal α-helix (αC) that
participates in RNA binding. Modifications to the classical RNP fold include a long N-terminus and two
insertions consisting of two sets of small anti-parallel β strands, the first between β4 and αC and the second
at the C-terminus.
Cap Binding to CBP20

The methylated guanine ring of the cap analog m7GpppG binds inside a cavity in CBP20 (Figure 3).
Recognition and binding of the cap structure requires the following: 1) Enhanced stacking interactions
between the electron-deficient m7-guanine ring and two electron-rich aromatic residues (Y20 and Y43)
(Figure 4). 2) A "planar" network of electrostatic interactions (hydrogen bonds and salt bridges) between the
guanosine base and a semicircular loop (residues 112-133) (Figure 5). Mutagenesis studies in this region are
needed to establish the contribution of each residue to cap recognition and binding.

The mechanism for the stabilization of the non-methylated guanosine base of m7GpppG is similar to that of
other reported RNP proteins and is achieved through hydrophobic interactions with V134 and hydrogen bonding
with main chain atoms (amide nitrogen of R127 and carbonyl oxygen of R129) and Y138.

Role  of  CBP80  in  cap  binding  stabilization 

Izaurralde et al. (15) had shown previously that binding of the cap analog to CBC is inherently more
stable than binding of the cap to CBP20 alone. The structural basis for this difference remained an open
question after the publication of the structure of CBC by Mazza et al. (24). The N-terminus of CBP20
(residues 6-14) makes extensive hydrophobic and ionic interactions with residues from four α-helices
(residues 31-39, 67-71, 323-335) in CBP80 (Figure 6). By stabilizing the N-terminus of CBP20 and the residues
close to Tyr20 (which stacks against the m7-guanine ring), CBP80 increases the affinity of CBP20 for the cap
structure.

Phosphorylation  of  Threonine  79 

We have observed electron density for a possible phosphorylation site at Thr79 (loop 3). Comparison
of this region between our structure and the structure of Mazza et al. (24) reveals a small change involving
Tyr49 and Phe50 (Figure 7) that could lead to stabilization of the C-terminus of CBP20 (which participates in
RNA binding). At the moment this is speculative. Further experiments are needed to corroborate this
finding and to establish the possible role of phosphorylation (if any) in RNA binding to CBP20.

Possible loop 3 movements upon cap RNA binding

Further comparisons between the bound structure reported here and the unbound structure reported by
Mazza et al. (24) show potential rearrangements in loop 3. Upon RNA binding, Gly72 could function as a
pivot point for loop 3 movements, since in the unbound structure Leu73 localizes to the cap-binding site
(Figure 8). Most loop 3 regions (from sequences whose secondary structure and topology are known) vary in
sequence and number of residues and can assume a wide range of conformations that could be important for
RNA binding. We have recently obtained crystals of CBC without the cap analog. Solving this structure
should provide important insights into the specific structural changes associated with cap binding.

A  possible  RNA  binding  region  in  CBP20 

In RNP proteins, loops 1 and 3 and the C-terminal αC have been shown to participate in mRNA
binding (25-26). A positively charged groove (Figure 9, blue arrows) is present on the surface of CBP20 just
inferior to the non-methylated guanosine residue. This groove is formed from residues in loop 1 (floor), loop 3
and the C-terminal domain (lateral wall) in CBP20 and residues from CBP80 (opposing lateral wall and
pocket). We have found a tubular electron density inside this pocket in CBP80 and have modeled it as a
molecule of polyethylene glycol 400 (PEG400).

Is CBC a decapping enzyme?

Residues close to the ribose and the phosphates (Gln27, Gln133, and Arg127) in the cap-binding
cavity present an architecture that could resemble the Ras/RasGAP complex (including Gln61 from Ras and
Arg789 from RasGAP). This finding raises the possibility that under certain circumstances (e.g. binding to
another partner), CBP20 could have decapping activity by hydrolyzing the gamma phosphate of m7Gppp.
Interestingly, the yeast decapping enzyme Dcp1p (25 kDa) shares 15.2% identity with CBP20 and has among
its conserved residues Tyr68 (Tyr43 in CBP20), which participates in cap binding (see above).
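
To make the identity figure concrete, a percent identity of this kind is typically computed from a pairwise alignment by counting matching residues over aligned positions. The sketch below is a minimal illustration under that assumption; the report does not say which alignment program or which denominator convention produced the 15.2% value, and the function name and the toy sequences in the usage note are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Only columns where neither sequence has a gap are counted. Other
    conventions (for example, dividing by the shorter sequence length)
    give different numbers, so the denominator should always be stated.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

For the toy alignment `percent_identity("MKT-AY", "MRTLAY")`, five columns are gap-free and four of them match, giving 80.0. An identity as low as 15.2% is weak evidence of relatedness on its own, which is why the conserved cap-binding tyrosine is the more informative observation here.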

KEY  RESEARCH  ACCOMPLISHMENTS 

•  Expression of CBC in Sf9 cells.

•  Purification and crystallization of CBC.

•  Solution of the atomic structure of CBC in complex with m7GpppG at 2.2 Å resolution.

•  Refinement of the CBC structure with a final Rfree/Rfactor of 25.7/21.7 and good geometry.

•  Crystallization of the unbound complex.


REPORTABLE  OUTCOMES 

•  Manuscript  in  preparation: 

ATOMIC  STRUCTURE  OF  THE  NUCLEAR  CAP  BINDING  PROTEIN  (CBP20)  IN  COMPLEX  WITH 
CAP  BINDING  PROTEIN  80  (CBP80)  AND  THE  CAP  ANALOG  m7GpppG. 

G.A. Calero, K.F. Wilson, J.L. Rios, T.K. Ly, R.A. Cerione and J.C. Clardy.


CONCLUSIONS 

We have solved the structure of the triple complex between CBP20, CBP80 and the cap analog
m7GpppG at 2.2 Å resolution. The atomic structure of this triple complex represents the second eukaryotic
cap binding structure solved to date, the first being eukaryotic initiation factor 4E (eIF4E), and includes key
structural aspects that were not found in the partial structure of the CBC published by Mazza et al. (24).

The fold of CBP20 conforms to a classical RNP fold and differs significantly from the other two cap
binding proteins, eIF4E (8-stranded antiparallel β sheets packed against 3 long helices) (27) and the vaccinia
virus cap binding protein VP39 (7-stranded β sheets surrounded by 5 parallel helices) (28-29).

Binding of the capped RNA to CBP20 is slightly different from cap binding by eIF4E and VP39.
Table 2 illustrates some of these differences (29). The low B-factors observed (35-40 Å²) and the excellent
electron density for the methylated guanosine base could indicate a tight interaction with CBP20.
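
As a rough guide to what these B-factors mean physically, the isotropic B-factor is related to the mean-square atomic displacement by B = 8π²⟨u²⟩. The sketch below converts a B-factor into the corresponding root-mean-square displacement; it is an illustration of this standard relation only, and the function name is hypothetical.

```python
import math


def rms_displacement(b_factor):
    """Root-mean-square displacement (in Å) for an isotropic B-factor (in Å²).

    Uses B = 8 * pi^2 * <u^2>, so u_rms = sqrt(B / (8 * pi^2)).
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))
```

For B-factors of 35 and 40 Å², `rms_displacement` returns values of roughly 0.67 and 0.71 Å, respectively.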

An important aspect of the structure of the CBC is the possible stabilization of cap binding by CBP80.
This is a novel finding for cap binding proteins and could represent an important regulatory mechanism. For
example, we have seen that binding of importin-α to the N-terminal nuclear localization signal (NLS) of
CBP80 increases cap affinity (see previous report). The structure of the CBC shows that residues in the
N-terminus of CBP80 participate in cap stabilization; interactions of importin-α with CBP80 could therefore
lead to increased cap-binding affinity. On the other hand, disruption of the CBC could lead to the
release of capped RNA.

Two more aspects of the structure deserve comment. First, the strong electron density associated with
T79 raises the possibility that this residue could be phosphorylated. Second, the structural similarities
between the GTP binding site within the Ras/RasGAP complex and the cap-binding site within CBP20 suggest
that CBP20 could be a decapping enzyme. At the moment, these are just speculations raised by structural
findings, and they represent a clear example of how structure can drive biochemistry. Decapping assays are
being conducted to investigate the role of the CBC in decapping, using the yeast decapping enzyme Dcp1p
as a control.

The structure of CBP20 in complex with CBP80 and m7GpppG gives us new clues regarding how
CBP20, an RNP protein, interacts with capped RNA and how CBP80 regulates this interaction. We know that
CBP80 is needed for nuclear export of the CBC, but the fact that CBP80 is significantly larger than CBP20
raises the possibility that CBP80 could also function as a scaffold for other proteins that could interact with
CBP20 or the bound capped mRNA. The CBP20-CBP80 interaction is reminiscent of the interaction
between eIF4E and eIF4G (which shares domain similarity with CBP80). In that case eIF4G works as a binding
platform for other proteins such as eIF4E, eIF4A (a helicase) and the poly(A) binding protein (30-32), among others.

New biochemical and structural studies are needed to fully understand the role of CBC in mRNA
stability and how this can be influenced by signaling events via EGF-R or ErbB2/Neu. During the last six
months of my fellowship, I will concentrate on trying to obtain the structure of the unbound CBP20-CBP80
complex and to improve the crystals of the CBC-importin-α complex.
Figure 9